Abstract

Background, Aim and Scope. LCA studies often need precise and
representative parameters. Probably the most relevant are those with a
direct influence on the functional unit, whose definition is crucial in
the conduct of any LCA. Changes in the functional unit show directly in
LCI and LCIA results. In comparative assertions, a bias in the functional
unit may lead to a bias in the overall conclusions. Since quantitative
data for the functional unit, such as geometric dimensions and specific
weight, often vary, the question arises how to determine the functional
unit, especially if a comparative assertion is to be representative for a
region or market. The aim and scope of the study is to develop and apply
methods for obtaining precise and representative estimates for the
functional unit as one important parameter in an LCA study.

Materials and Methods. Statistical sampling is applied to obtain
empirical estimates for the weight of yoghurt cups, as a typical
parameter for the functional unit. We used a two-stage sampling design,
with stratified sampling in the first stage and three different sampling
designs in the second stage, namely stratified sampling, cluster
sampling, and a posteriori stratification. The sampling designs are
motivated and described. In a case study, each is used to determine a
representative weight for 150 g yoghurt cups in Berlin, at the point of
sale and within a specific time. In the first sampling stage, food
markets are randomly selected, while in the second stage, yoghurt cups in
these food markets are sampled. The sampling methods are applicable due
to newly available internet data. These data sources and their
shortcomings are described.

Results. The random sampling procedure yields representative estimates,
which are compared to figures for market leaders, i.e. yoghurt cups with
very high occurrence in the supermarkets. While single types of yoghurt
cups showed moderate uncertainty, representative estimates were highly
precise.

Discussion. Results show, for one, the performance of the applied
statistical estimation procedures, and they show further that adding
more information to the estimation procedure (on the shape of the cup,
on the type of plastic, on the specific brand) helps to reduce
uncertainty.

Conclusions. Estimates and their uncertainty depend on the measurement
procedure in a sensitive manner; any uncertainty information should be
coupled with information on the measurement procedure, and it is
recommended to use statistical sampling in order to reduce uncertainty
for important parameters of an LCA study.


Recommendations and Perspectives. Results for market leaders differed
considerably from representative estimates. This implies that market
leader data, or data with a high market share, should not be used as a
substitute for representative data in LCA studies. Statistical sampling
has barely been used for Life Cycle Assessment. It turned out to be a
feasible means for obtaining highly precise and representative estimates
for the weight of yoghurt cups in the case study, based on empirical
analysis. Further research is recommended in order to detect which
parameters should best be investigated in LCA case studies, which data
sources are available and recommended, and which sampling designs are
appropriate for different application cases.

Keywords: Berlin; empirical data sampling; functional unit;
representativeness; sampling design; statistical sampling; stratified
sampling; uncertainty; yoghurt cups


1 Representative Data 

Life Cycle Assessments need representative data for drawing
well-founded conclusions about their object of study. For inventory
datasets, several industrial branches in the EU have undertaken the task
of providing datasets for 'their' processes that are representative for
the industrial branch. Some LCA studies try to get representative
datasets by considering large samples, i.e. high market shares;
sometimes, high market share is quoted as an indicator for good
representativeness¹.

In order to obtain representative inventory data and conclusions,
surveys [IAI 2003, FEFCO 2003, Boustead 2003] and expert judgement are
often applied; the use of statistical sampling is not yet reported.

Statistical sampling and measurement, however, is a means to obtain
truly representative data. The effort is often comparable to other
approaches, or even lower. A classical example for the superiority of
statistical sampling dates from the US presidential elections in 1936.
In order to estimate who would win, two 'sampling methods' were
performed, independently. Dr. Gallup conducted 3,000 interviews, with
interviewees selected via (random) quota sampling of eligible voters,
and predicted from these a victory of President Roosevelt over Landon,
a Republican, with 54% vs. 46%.


* ESS-Submission Editor: Seungdo Kim, PhD (kimseun@msu.edu) 

1 "Data for the Life Cycle Survey were obtained from: 82 world-wide aluminium 
electrolysis plants producing 14.7 million metric tons of primary aluminium, 
representing about 60% of world-wide aluminium smelting operations 
(base: primary aluminium from WBMS 24,464,400 t)." [IAI 2003, p. 4] 
The 'Literary Digest', a popular magazine at the time, polled
10 million people selected from automobile registration lists
and telephone directories, evaluated 2.4 million surveys, and
predicted a victory for Landon (57%). Actually, Roosevelt
won the election with 61%. 3,000 interviews yielded a better
result than 2.4 million surveys; seemingly the smaller sample
reflected the eligible voters better (consider that
in 1936, telephones and cars were still predominantly owned
by the middle and upper classes).

In Life Cycle Assessments, statistical sampling is barely applied²; it
often appears impractical. One cannot force, for example, a company to
provide process data merely because this company was selected in the
sampling procedure. However, not all aspects of an inventory are
critical for its representativeness. There are several options in
sampling design which allow tailoring the design to meet specific
challenges regarding data availability. And not least, availability of
data has increased dramatically in the past months, due to legislation
and business activities (e.g. [1-2,4]).

Among all the different parameters of an inventory system, the
functional unit (f.u.) is often of prime importance [Kim Dale 2006,
Hischier Reichart 2003, Cooper 2003]. It is like a pivot for the whole
LCA model. Changes in the f.u. are directly reflected in LCA results;
in a linear LCA model, for example, a 10% increase in the f.u. means a
10% increase in calculated environmental impact. In comparative
assertions, even the ranking between alternatives might change if the
functional units of the compared systems do not change in a
corresponding manner.

Any LCA that strives for representativeness thus should strive for
representative data for the functional unit. The remainder of the
article describes, for yoghurt cups, a procedure to obtain truly
representative data by statistical sampling.

2 Background: Statistical sampling theory in brief 
2.1 The idea of statistical sampling 

Statistical sampling has developed into an established field of science
with broad practical experience. Today it is applied in product
testing, in political surveys, as well as in health science. This paper
has no room for a full introduction to sampling theory, nor do the
authors feel this would be their prime task. Instead, we provide some
basics as necessary for understanding the terminology and concept of
the article. For interested readers, [Cochran 1977], and also
[Sudman 1976], [Thompson 2002], and [Schwarz 1975] are good
introductions.

Statistical sampling is a technique to collect data when few resources
are available or when the overall population³ is not completely
accessible in an efficient and reliable manner [Cochran 1977, p. 3].
Sampling means, basically, 'drawing' a number of n elements from a
population of size N, n<N. Depending on the composition of the
population, different approaches are used to obtain representative
samples.

A sample is considered representative if and insofar as its composition
'represents', or reflects, that of the population. If the sampling
procedure is designed in a satisfying manner, results can be as good as
if the whole population was analysed. In some cases, results are even
better, as the diligence for a single item can be enhanced. Any figures
of interest are calculated from the sample and generalised in order to
obtain figures for the target population by estimating them from the
sample. Typically, these figures are 'average' and 'spread' of
numerical properties of an object of study.

2 An exception is the LIME project: [Itsubo Inaba 2003] applied sampling
in a survey for the representative Japanese population for monetarising
environmental endpoints.

As an example, the object of study could be yoghurt cups of 
150 g yoghurt, and the calculated parameters the average 
weight and the standard deviation (or variance) of the weight 
of the cups, for the targeted population. 

A sound design of the sampling procedure and an adequate estimation
function are essential for the quality of the estimate. A first step is
to clearly define the population, which should, in turn, be determined
by goal and scope of the study.

2.2 Sampling designs 

There are several choices in sampling design; properties of 
the population, the characteristics to be sampled, and, not 
least, the desired sampling precision and affordable costs 
and effort, will determine the sampling design that fits best 
for a study. 

Simple random sampling is the easiest sampling design. From a
population with N elements, n elements are selected randomly without
replacement so that each element in the population has the same chance
of being drawn. In many cases, a further division of the population
reduces the variance of the estimate and will thus improve its quality.
If the population is rather heterogeneous and may be split into subsets
which are each more homogenous, the variance from a simple random
sample of the whole population will be higher than necessary. More
homogenous subsets will each have a lower variance. Now, if the subsets
can be assumed to be independent, which is often the case, then the
overall variance is simply the weighted sum of the variances of the
subsets. In an extreme case, with completely homogenous subsets, their
variance is zero, and the resulting overall variance is zero as well.
The subsets are called strata, one subset a stratum. Overall, the
'stratification' has a positive effect if the calculated variance is
much lower than if calculated directly, without stratification. In
stratified random sampling, one will take a simple random sample from
each stratum [Cochran 1977, pp. 89].
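
The effect of stratification described above can be illustrated with a
minimal Python sketch. The cup weights, the stratum names and the
sample sizes are hypothetical, and the finite population correction is
ignored for brevity; the sketch only compares the textbook variance of
the mean under simple random sampling with that of a stratified mean
for the same total sample size.

```python
from statistics import pvariance

# Hypothetical cup weights in grams for two shape strata (illustrative only).
strata = {
    "narrow": [5.0, 5.1, 4.9, 5.0, 5.1, 4.9],
    "wide": [6.0, 6.1, 5.9, 6.0, 6.1, 5.9],
}


def srs_variance_of_mean(values, n):
    """Variance of the mean of a simple random sample of size n (no fpc)."""
    return pvariance(values) / n


def stratified_variance_of_mean(strata, n_per_stratum):
    """Variance of the stratified mean: sum over strata of W_h^2 * S_h^2 / n_h."""
    total = sum(len(v) for v in strata.values())
    return sum(
        (len(v) / total) ** 2 * pvariance(v) / n_per_stratum
        for v in strata.values()
    )


population = [w for v in strata.values() for w in v]
var_srs = srs_variance_of_mean(population, n=4)          # 4 cups overall
var_strat = stratified_variance_of_mean(strata, n_per_stratum=2)  # 2 per stratum
print(f"SRS: {var_srs:.5f}  stratified: {var_strat:.5f}")
```

Because the two hypothetical strata differ mainly between each other and
hardly within, the stratified variance comes out far smaller than the
simple random sampling variance.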

Imagine yoghurt cups of different shape, broad and narrow ones. The
shape will influence the weight of the cup; it can be observed easily
and is thus suitable as an indicator for the definition of homogenous
and disjoint (independent) strata. Calculating the mean of the weight
within these strata of cups of the same shape will produce a precise
estimate, for each shape, with only a few measurements in each stratum.
If the shares of cups of different shapes in the overall population are
known, then a more precise mean for the population can be calculated by
combining the estimated means from the strata.

3 'Population' is the term used in statistics to describe the set from
which the sample is taken [Sudman 1976, p. 11]; in a narrow sense, this
population is called the sampled population. The sampled population
should match the population about which information is wanted, the
target population [Cochran 1977, p. 5]. Since a sampling study attempts
to generalise results from the sample to the target population, the
latter is also called 'the universe' [Sudman 1976, p. 12].

A drawback is that in order to know the shares (of each type of cup,
e.g.) and to be able to draw from each stratum, a reliable and complete
list of the elements and their attributes in the population of interest
is needed, which is often not available, or at least time-consuming and
costly to obtain. For example, one would need a complete list of
yoghurt cups and their shapes in all supermarkets in Berlin.

In contrast, a list of 'higher-level entities' (a list of supermarkets,
for this case) is often available. This calls for cluster sampling, a
different sampling design. The idea is to group the elements in the
population in classes, which are called clusters, and then perform a
random sampling on these clusters and determine all elements in the
selected clusters in a second step. In the example above, one would
first perform random sampling on the 'supermarket list' and afterwards
determine all the yoghurt cups for the selected supermarkets. Quite
contrary to stratified sampling, where the strata have to be as
different as possible and the elements within the strata as similar as
possible, the clusters in cluster sampling have to be as similar as
possible and the elements within the clusters as different as possible.
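
A minimal Python sketch of the cluster sampling idea follows. The market
names and cup weights are hypothetical, and the pooled mean over all
elements of the drawn clusters is only one simple choice of estimator;
it is not claimed to be the estimator used in the case study.

```python
import random

# Hypothetical markets (clusters) with the cup weights in grams found in each.
markets = {
    "market_a": [5.0, 5.8, 6.2],
    "market_b": [4.9, 5.9],
    "market_c": [5.1, 6.0, 6.1, 5.7],
    "market_d": [5.2, 5.6],
}


def cluster_sample_mean(clusters, n_clusters, rng):
    """Draw n_clusters clusters at random, keep ALL of their elements,
    and return the drawn cluster names with the pooled element mean."""
    drawn = rng.sample(sorted(clusters), n_clusters)
    weights = [w for name in drawn for w in clusters[name]]
    return drawn, sum(weights) / len(weights)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
drawn, estimate = cluster_sample_mean(markets, n_clusters=2, rng=rng)
print(drawn, round(estimate, 3))
```

Only the list of clusters is needed up front; the elements are
enumerated for the drawn clusters alone, which is the practical
advantage named in the text.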

If the property of interest (the weight of a yoghurt cup) can be
obtained only with some effort, sampling on large populations might
become tedious. In this case, a second sampling step (that 'samples
from the sample') is often convenient. This type of design is called
two-stage sampling.

The sampling techniques described here represent only a small extract
of possible sampling designs. Depending on the task, other designs and
combinations might be adequate. For a description of further designs,
with emphasis on variance estimation, see e.g. [Wolter 1985].

3 A Case Study: Uncertainty in the weight of yoghurt cups in Berlin
3.1 Aim of the exercise 

The aim of the case study is to provide a precise and representative
estimate for the weight of plastic yoghurt cups for 150 g yoghurt at
the point of sale in Berlin, in food markets, available for consumers
on one specific day. The precision shall be expressed as standard
deviation. In an exploratory data analysis step, uncertainty for
different types of yoghurt cups shall be determined. The sampling
design needs to take into account how yoghurt cups are presented to
customers in Berlin. Different designs shall be evaluated.

3.2 Inclusion of the marginal consumption in a 
comparative LCA 

The population comprises all yoghurt cups that contain 150 g yoghurt
and are presented to consumers in supermarkets and other food markets
in Berlin, on a specific day. The type of yoghurt (natural yoghurt,
with fruit) may vary. The cups must be made from plastics; compound
material (carton, paper, and plastic) is excluded from the analysis.

One type of yoghurt in one market is considered as one element in the
population; several cups of the same type in one market are not
distinguished, while the same type in another market is another element
in the population.

The number of the different types of yoghurt cups in different markets,
in other words the number of elements N in the population, is not known
beforehand.

Thus applying a two-stage sampling design is reasonable. In the first
stage, the population consists of the supermarkets and food markets in
Berlin. From this population, which will be called 'market population'
in the following, representative markets are selected by stratified
random sampling, and the number of different types of yoghurt cups is
determined within this sample, the 'market sample'. This is the first
sampling step.

From the market sample, an estimator is available for the number of
different yoghurt cup types in Berlin. This is another population,
which will be called 'cup population'. Its elements are the different
types of yoghurt cups available in Berlin at point of sale, and at the
time of sampling. For the selected markets, both cluster sampling and
stratified sampling were applied. An a posteriori stratification was
applied to the cluster sample in order to investigate further possible
improvements of the estimator (Fig. 1).


[Figure 1. First stage: stratified sample of markets. Second stage:
cluster sample of yoghurt cups; a posteriori stratified sample of
yoghurt cups; stratified sample of yoghurt cups.]

Fig. 1: Sampling designs applied in the case study
3.3 Sampling procedure

In the first stage of sampling, for the market population, it is
assumed that the types of yoghurt cups are similar for supermarkets of
the same brand. Accordingly, the brand of the supermarket is used as
the criterion for defining the market strata. For larger brands, each
brand is one stratum. Smaller brands with fewer than 40 markets were
grouped into one stratum.

The number of different markets in Berlin is easily available 
via internet research. Yellow pages [1] and Google Maps [2] 
provided, in combination, an overview of markets and their 
brands and location. Overall 1215 markets were found and 
compiled into a list. The strata and the aggregated number 
of markets are given in Table 1. Related brands that belong 
to a single company but are intended for different types of 
customers (e.g. budget-priced vs. regular range of goods) 
were treated as different brands. 

Based on this list, stratified random sampling was performed 
with units proportional to size; the probability of a market 
to be drawn being proportional to the number of markets of 


Table 1: Number of food markets in Berlin, per market brand

Market stratum   Market           Number of markets in Berlin city
1                Aldi             183
2                Kaiser's         173
3                Plus             162
4                LIDL              93
5                SPAR              86
6                Penny             72
7                EDEKA             71
8                Rewe              63
8                miniMal           31
9                Reichelt          54
10               Netto             43
11               Meyer&Beck        41
12               extra             35
12               Butter-Lindner    31
12               Kaufland          17
12               Norma             17
12               Karstadt          13
12               Real              10
12               Bio Company        9
12               Kaufhof            5
12               Ullrich            3
12               Birlik             1
12               K-Markt            1
12               LPG BIO Markt      1
Σ                                1215



this brand. The sampling ratio was set to 3% (3% of each
stratum was drawn), hence elements from larger strata are
more frequent in the sample. Overall, 35 markets were
randomly selected.
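
The proportional allocation can be sketched in Python as follows. The
stratum sizes are aggregated from Table 1 (stratum 8 as Rewe plus
miniMal, stratum 12 as the sum of all brands labelled 12); the rounding
rule is an assumption made for this sketch, so the resulting total need
not coincide exactly with the 35 markets reported above.

```python
# Number of markets per stratum, aggregated from Table 1.
markets_per_stratum = {
    1: 183, 2: 173, 3: 162, 4: 93, 5: 86, 6: 72,
    7: 71, 8: 63 + 31, 9: 54, 10: 43, 11: 41,
    12: 35 + 31 + 17 + 17 + 13 + 10 + 9 + 5 + 3 + 1 + 1 + 1,
}
SAMPLING_RATIO = 0.03  # 3% of each stratum, as stated in the text


def proportional_allocation(sizes, ratio):
    """Sample size per stratum: ratio * N_h, rounded to the nearest
    integer with at least one market per stratum (assumed rounding rule)."""
    return {h: max(1, round(ratio * n_h)) for h, n_h in sizes.items()}


allocation = proportional_allocation(markets_per_stratum, SAMPLING_RATIO)
for stratum, n_h in allocation.items():
    print(f"stratum {stratum}: draw {n_h} of {markets_per_stratum[stratum]}")
print("total markets drawn:", sum(allocation.values()))
```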

The selected markets were visited and the yoghurt cups in 
these markets inspected, noting properties of all yoghurt cups 
that belonged to the population (excluding e.g. cups made 
from cardboard). Four of the drawn markets were found to 
be out of business. Three markets were added as replacements. 

For the cluster sample sixteen markets of the first stage were 
randomly selected and all of the respective yoghurt cups in 
these markets were analysed. Markets in the cluster sample 
were located off-centre, making them more difficult to visit. 
If the location of a market had any influence on the weight 
of the yoghurt cups, a bias in the cluster sample would be 
introduced. This was assumed to be not the case, which was 
tested later on. The following attributes were thought to 
influence the yoghurt cup weight in the population: (i) type 
of plastic (PP, PS); (ii) colour and opacity (clear, white); (iii) 
width (small, wide), and (iv) overall shape (cups with foot 
and without foot). 

These attributes were used for defining the strata, which resulted in
16 different strata for the stratified yoghurt cup sample. Table 2
shows the strata and the 'assigned' yoghurt cups together with the
characteristics of each stratum.

In a next step, random sampling was performed, per stratum, with a
sampling ratio of 20%. A yoghurt cup in one stratum thus has a chance
of 20% to be drawn.

Fig. 2 shows the markets in the sample, the number of different types
of 150 g yoghurt cups offered to customers in each market, and the cups
drawn for the stratified sample and the cluster sample.

All of the respective cups in the cluster and stratified samples were
bought and emptied, any labels were carefully removed, and the empty
cups were weighed on a high-precision laboratory balance⁴. Data were
entered into an MS ACCESS database and checked for errors, and further
analyses were performed using the open source statistical software
package R (Version 2.3.1) [3] and spreadsheet software.

3.4 Uncertainty in sampled data 

The following figures show the uncertainty for all yoghurt 
cups, for two selected types of yoghurt cups, for all sampled 
clusters, for all sampled strata, and, finally, a comparison of 
the uncertainty for the overall sample, for one cluster, for 
one stratum, and for one type of yoghurt cup. 

For all types of yoghurt cups, the weight varies between less than four
and more than eight grams. The shape of the histogram slightly
resembles a log-normal probability distribution, with two peaks
(maximum frequency) at around 5 and 6 g, which might indicate an
overlap of two different distributions⁵. The overall sample is a blend
of the weights of many
4 Weighing machine: OMNILAB OL 210-A, max 220 g ± 0.0001 g, from 
NovaBiotec laboratory, Berlin. 

5 A histogram is a graphical representation of tabulated frequencies; it is 
often used to explore the shape of data distributions. 
Table 2: Yoghurt cup strata in the sample, their characteristics and
respective yoghurt brand names (note that the stratum no. is different
from the one shown in Table 1; PS: polystyrene; in some cases a
description of the shape is added to the brand name)

Stratum no.   PS?   Clear plastic?   Narrow?   Foot?

6             Yes   No               Yes       No
    Yoghurt brand names: Alpa Frucht Yoghurt; Alpa Joghurt pur 4*150;
    Berchtesgadener Land Bioghurt; Biac probiotischer Joghurt 4*150;
    Bio Wertkost Bio Joghurt; BioBio Fruchtjoghurt; BioBio Naturjoghurt;
    Bissou 4*150; demeter Biogurth; Ehrmann Almighurt; Ehrmann Genuss
    Diat; Fruchtjoghurt mild 4*150; Fruchtjoghurt mild 8*150; Grazil
    Fruchtjoghurt 4*150; Herzgut Fruchtjoghurt; Martinshof
    Ziegenjoghurt; Mibell Joghurt PS; Milram Magermilch Joghurt;
    Mondelice Fruit Split 4*150; Muller Joghurtschnee; Naturell 3.5%
    Fett; nom l.free; Rogge Bio Joghurt; Sachsenmilch Fruchtchen;
    Sobbeke; Yogosan Fruchtjoghurt 4*150; Bissou Joghurt auf Frucht

8             Yes   No               No        No
    Yoghurt brand names: Grazil Joghurt; Sachsenmilch Vanillezauber;
    Yoganic 0; yes wide

9             No    Yes              Yes       Yes
    Yoghurt brand names: Elite Joghurt mild (foot)

10            No    Yes              Yes       No
    Yoghurt brand names: Campina Fruchtstrudel; Dr. Oetker Jobst; Elite
    Vanilla auf Frucht (clear); Gut & fein Joghurt auf Frucht; ja!
    Joghurt mild (clear); Muller Froop

13            No    No               Yes       Yes
    Yoghurt brand names: Elite Sahnejoghurt mild (foot); Gut & Gunstig
    Sahnejoghurt; Mertinger Sahnejoghurt; Sahnejoghurt; Tipp Sahne
    Joghurt mild; Zott Sahnejoghurt

14            No    No               Yes       No
    Yoghurt brand names: Bauer Joghurt mild; Bioness Bio Fruchtjoghurt;
    Campina Milchreiter; Campina Optiwell; Elite Joghurt mild; Elite
    Karamell/Vanilla; Elite Vit Balance; Erlenhof Joghut mild;
    Goldblume Joghurt mild; Gropper probiotic Jogurtcreme; Gut & fein
    Joghurt mild; Gut & fein Joghurt natur 4*150; ja! Joghurt mild; ja!
    Magermilchjoghurt; Mark Brandenburg Joghurt mild; Mibell Joghurt;
    Milbona Sahnejoghurt; Pro Jogo 4*150; proactiv diat; Tipp Joghurt
    Creme; Tipp Joghurt mild; Yoganic 0.1% (narrow); Zott Jogole

16            No    No               No        No
    Yoghurt brand names: Bauer Yogorande; Choco Picnic Joghurt +
    Schokoraspel; Elinas Joghurt nach griechischer Art 4*150; Hofmaier
    Cream-Jogh.; Hofmaier Joghurt & Chocosplits; Landliebe Joghurt;
    Minus L Joghurt; Movenpick Feinjoghurt; Yocous; Yogosan
    Edelrahmjoghurt



[Figure 2: map of the supermarkets in the sample. Legend: each square
represents one type of yoghurt cup in a supermarket drawn in sample
stage 1; empty squares are elements not drawn in the stratified sample
(stage 2; type of yoghurt cup that is noted but not weighed); circles
mark supermarkets in the cluster sample.]

Fig. 2: Location of the supermarkets in the sample, the number of yoghurt cups in each market (squares), and the yoghurt cups in the stratified sample
(greyed squares) and the cluster sample (circles)
[Figure 3: three histograms of weight per cup. Panels: All yoghurt cup
weights; Results from a single cup type (Ehrmann Almighurt); Results
from a single cup type (Zott Jogole).]

Fig. 3: Histograms for the overall sample and for two different yoghurt cups, provided as examples


different types of yoghurt cups. For each type, the variation in weight
is lower than for the overall sample and the histograms often resemble
normal or lognormal probability distributions. In Fig. 3, this is shown
for two specific cup types in comparison to the histogram of the
overall sample.

The coefficient of variation, calculated as the ratio of empirical
standard deviation to mean, per type of yoghurt cup, lies in the range
from almost zero to eight percent, with the majority between one and
three percent (Fig. 4, see Annex, p. 277). This can be interpreted as
the relative uncertainty in the sampled data.

The cluster sample is shown in box plots in Fig. 5, one cluster being
one supermarket. Two clusters (the two on the left side of the figure)
have quite narrow samples; these are two discount supermarkets which
offer only a few different types of yoghurts. The other clusters have a
high variation per cluster, but seem rather similar regarding quartile
range, and to some extent also regarding the mean (the thick black line
in each box). The strata, on the other hand, clearly differ from one
another, and have a much lower variation per stratum (see Fig. 5).
Uncertainty in strata and clusters thus reflects the aim of the
sampling (homogenous strata, clusters according to the population, see
section 2).

3.5 Calculation procedure for a representative estimate

Formulas for the calculation of estimates in stage 1 and stage 2, for
stratified, cluster and a posteriori stratification, as well as for
calculating uncertainty, are given in the Annex, p. 276.
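
As a small illustration of the coefficient of variation used in section
3.4 (empirical standard deviation divided by the mean), the following
Python sketch computes it for one cup type. The five weights are
hypothetical and are not measurements from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(weights):
    """Empirical (sample) standard deviation divided by the mean."""
    return stdev(weights) / mean(weights)


# Hypothetical repeated weighings of one cup type, in grams.
cup_weights = [5.02, 4.98, 5.05, 4.95, 5.00]
cv = coefficient_of_variation(cup_weights)
print(f"coefficient of variation: {cv:.2%}")
```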
[Figure 5: two boxplot panels. Panels: Results from the cluster sample;
Results from the stratified sample.]

Fig. 5: Boxplots for the cluster sample, per cluster, and for the stratified sample, per stratum


3.6 Results

The following results are achieved with calculation formulas as
described in section 3.5. They refer to the estimated mean and variance
of the weight of one 150 g yoghurt cup in Berlin, representative for
the day of sampling:

Stratified sampling:

$$\hat{Y} = \sum_{h=1}^{M} \frac{N_h}{N}\,\bar{y}_h = 5.6474\ \mathrm{g}$$

with a variance of

$$\mathrm{Var}(\hat{Y}) = \sum_{h=1}^{M} \frac{N_h^2}{N^2}\,\frac{N_h - n_h}{N_h}\,\frac{s_h^2}{n_h} = 0.0034$$

With:

$\hat{Y}$ = Estimate for the mean
$h$ = Index number of stratum
$N$ = Number of food markets in Berlin (population 1)
$N_h$ = Number of food markets of stratum $h$ in Berlin
$M$ = Number of strata
$n_h$ = Number of food markets of stratum $h$ in the sample
$\bar{y}_h$ = Average number of types of yoghurt cups in a food market
in stratum $h$
$s_h^2$ = Variance of the number of types of yoghurt cups in stratum $h$

The uncertainty in the population is calculated to $s^2 = 0.72$, the
variation coefficient to 15%.

Clustered sampling:

$\hat{Y}_{CL,QS} = 5.6589\ \mathrm{g}$, with a variance of
$\mathrm{Var}(\hat{Y}_{CL,QS}) = 0.0072$.

With: $\hat{Y}_{CL,QS}$ = Estimate for the mean, clustered sampling

The variance is equivalent to a standard deviation of 0.085, and to a
coefficient of variation (relative uncertainty, standard deviation
divided by the mean) of 0.015 or 1.5%.

A posteriori stratification of the cluster sample:

$\hat{Y} = 5.7245\ \mathrm{g}$ and $\mathrm{Var}(\hat{Y}) = 0.0014$.

With: $\hat{Y}$ = Estimate for the mean

The estimated variance is even lower than in the cluster sampling; it
corresponds to a standard deviation of 0.037, and to a variation
coefficient of 0.0065 or 0.65%.
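
The conversion from the reported variances to standard deviations and
coefficients of variation can be retraced with a few lines of Python.
The means and variances are the figures given in section 3.6; the
function and variable names are chosen for this sketch only.

```python
from math import sqrt

# (estimated mean in g, variance of the estimated mean), from section 3.6.
reported = {
    "stratified": (5.6474, 0.0034),
    "cluster": (5.6589, 0.0072),
    "a posteriori": (5.7245, 0.0014),
}


def sd_and_cv(mean_weight, variance):
    """Standard deviation and coefficient of variation of an estimate."""
    sd = sqrt(variance)
    return sd, sd / mean_weight


for design, (mean_weight, variance) in reported.items():
    sd, cv = sd_and_cv(mean_weight, variance)
    print(f"{design}: sd = {sd:.3f} g, cv = {cv:.2%}")
```

For the cluster sample and the a posteriori stratification, this
reproduces the standard deviations of 0.085 and 0.037 quoted in the
text.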


3.7 Discussion 

3.7.1 Weaknesses of the study 

Several weaknesses of the study are worth mentioning.

The weight of the cups could be influenced by remaining yoghurt in the
cup. It could also be influenced by water adhering either to the walls
of the cup or to paper labels. We aimed to minimise these effects by
thoroughly cleaning the cups, by drying them at room temperature, and
by removing any paper labels. Cups made from a combination of paper and
plastic were excluded, see section 3.2.

Some shops from the original sample were closed. In the sampling
procedure, they were replaced by other similar markets. Obviously,
non-existent markets cannot be discovered for the overall sample, i.e.
for those markets that have not been visited. In the sample, the share
of closed or replaced markets was about 10% (4 of 35). This figure may
influence the result (the estimates) if 'new' markets had a different
yoghurt product portfolio from markets that left business. We have, at
the moment, no such indication. In any case, a 10% rate is a sign that
the basic data does not match the speed of market transformation.

Fig. 7: Range of yoghurt cup weights in the sample, and estimated means of the three sampling techniques. Confidence intervals for the estimated mean
are too small to be visible in the graphic

In the shops, the cups were selected 'one per type', without a specific
sampling pattern for each type. Individual sampling patterns might have
existed and could lead to a biased selection of cups.

3.7.2 Sampling design 

In order to analyse the effects of the sampling design, the cluster
sample and the stratified sample are compared in a Chi-Square test⁶.
The null hypothesis for the test is: the weight of the yoghurt cups in
the stratified sample follows the probability distribution of the
weight of the yoghurt cups in the cluster sample.

6 A Chi-Square ($\chi^2$) test is a standard test for comparing two
different value series, see e.g. [Sachs 1992, pp. 420]; tests are
usually conducted by stating a null hypothesis ('these two data samples
have the same distribution'; 'all dogs are black bulldogs'). Then, a
test criterion (or several criteria) is determined to test whether this
hypothesis can be accepted or not, based on available (sample) data.
For details please refer to standard statistics textbooks, e.g.
[Tabachnik Fidell 2006].

With a significance level of 5% and 5 classes in the test, the test
statistic is calculated to $\chi^2 = 1.3572 < \chi^2_{0.95;4} = 9.4877$.
If this inequality were not fulfilled, the null hypothesis would need
to be rejected, which is clearly not the case here. Hence, it seems
fair to assume that the stratified sample and the cluster sample have
identical probability distributions. For further illustration, Fig. 6
(see Annex, p. 277) shows a histogram of cluster and stratified
sampling. Both distributions have about the same shape, which supports
the test results. One can thus assume that cluster samples and
population follow the same probability distribution, which is also a
confirmation of the quality of the cluster samples.
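
The comparison against the critical value can be sketched in Python.
The article does not state which variant of the Chi-Square test was
computed, so the sketch assumes a test of homogeneity on two samples
binned into the same 5 classes (4 degrees of freedom); the class counts
are hypothetical, and only the critical value 9.4877 is taken from the
text.

```python
def chi_square_homogeneity(counts_a, counts_b):
    """Pearson chi-square statistic for two samples binned into the same
    classes (2 x k contingency table, k - 1 degrees of freedom)."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand_total = total_a + total_b
    statistic = 0.0
    for a, b in zip(counts_a, counts_b):
        class_total = a + b
        for observed, sample_total in ((a, total_a), (b, total_b)):
            expected = sample_total * class_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


CRITICAL_95_DF4 = 9.4877  # 95% quantile for 4 degrees of freedom (from the text)

# Hypothetical numbers of cups per weight class in the two samples.
cluster_counts = [4, 12, 20, 10, 4]
stratified_counts = [3, 10, 15, 9, 3]

statistic = chi_square_homogeneity(cluster_counts, stratified_counts)
reject_null = statistic > CRITICAL_95_DF4
print(f"chi-square = {statistic:.4f}, reject null hypothesis: {reject_null}")
```

With these hypothetical counts the statistic stays far below the
critical value, so the null hypothesis of equal distributions is not
rejected, mirroring the outcome reported for the case study.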

On a more intuitive basis, Fig. 7 shows the overall estimator 
for the mean, the confidence intervals within four different 
strata, and the confidence interval for the type of yoghurt 
cup with the highest share in each stratum shown. Results 
per stratum differ considerably, and the cups with the highest 
share often differ considerably from the stratum to which 
they belong (strata 6 and 14). In contrast, the estimated 
representative mean is so similar for all sampling methods 
that the differences do not show in the figure, and the 
estimator is so precise that the confidence interval does not 
show either. 

Fig. 7 also shows that all sampling techniques lead to very 
similar results. For each of the designs applied, the result is 
very precise, with a very small variance leading to very narrow 
confidence intervals. 

3.7.3 Representativeness 

A demand for representative results is often intended to be 
met by expert judgement alone, or by using parameter values 
that represent a high market share, instead of applying 
statistical sampling. While expert judgement seems rather a 
fall-back option (used if no further information is available), 
the rationale of high market share deserves attention. 

Fig. 8: Comparison of the market leaders and of representative products 

Fig. 7 shows how different the mean of the population (here: 
one stratum as a set of yoghurt cups with identical shape) 
may be from the mean of its most common elements. 

This bias may have severe consequences in a comparative 
assertion, as demonstrated in Fig. 8. Stratum 6 and 
stratum 14 (white plastic, narrow shape, without foot) differ 
only in the type of plastic (PP vs. PS). The most common 
yoghurt cups, the 'market leaders', have a share of 31 and 
30% in the respective stratum. In the case of PS, the market 
leader is much lighter than the average, while for PP it is, on 
the contrary, much heavier. The average weights, and even 
the confidence intervals, for both strata lie in the same range. 
Thus, a comparison based on market leaders, or a high market 
share, will in this case produce a completely different 
picture than a comparison based on representative samples. 
The difference is considerable: 4.9 versus 6.1 g for the market 
leaders, or about 25%. Note that these are differences 
in the weight of the functional unit, which will, in common 
linear LCA models, show directly in the result. Considering 
only the products with the highest market share would 
thus lead to a fatally wrong conclusion 7. 


7 Case study results in this paper reflect the market share per different types 
of cups offered to customers, and not per unit sold. Since some cups are 
offered in packed units of four or eight, both will be different, and thus 
should be discerned. However, it seems likely that the principle discovered 
on the basis of the empirical data in this paper, namely that market 
leaders might differ from the rest of the market in study-relevant aspects, 
will hold also for a market share that is based on the number of units sold. 


Statistical sampling approaches offer a possibility to arrive 
at representative data; on the other hand, to be applicable 
they demand a design tailored to the challenges of data 
availability. 

3.7.4 Uncertainty 

The case study provides empirically based uncertainty 
information for the functional unit as one prime parameter in 
an LCA study. 

When expressed as coefficient of variation, the uncertainty is 
15% for the overall population; for single types of yoghurt 
cups it is in the range of one to three percent, with two outliers 
of about eight percent. For the estimated mean, this uncertainty 
is reduced dramatically, to 0.07% for the stratified sampling. 
This is an effect of the sampling procedure, and of the 
additional information taken into account by the sampling 
procedure (shape of the yoghurt cup, material, and so on). 

For quantitative figures, the results indicate that an LCA on 
yoghurt packaging which considers the brand name will face 
a relative uncertainty (coefficient of variation) between one 
and three percent from the functional unit alone. This is, of 
course, not much. Is it then relevant? There is no easy 
answer to this question. 

The functional unit is the first quantitative datum in an LCA 
calculation. Many others will follow. These 'following data' 
are of course also more or less uncertain. In the most basic 
case, an LCA product system is a linear chain of processes. 
A relative uncertainty of 2 percent in each product flow 
yields, after 7 processes, an overall uncertainty (propagated 
uncertainty plus uncertainty in each flow) of about 15% 8. 
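
The figure of about 15% can be retraced with a short calculation. The sketch below compounds the 2 percent per process multiplicatively over 7 processes, which is how footnote 8 arrives at its value. The second number is added for comparison only and is an assumption of this sketch, not part of the paper: a first-order root-sum-of-squares combination, another common convention for combining independent relative errors.

```python
import math

RELATIVE_UNCERTAINTY = 0.02  # 2 percent per product flow, as stated in the text
N_PROCESSES = 7              # length of the linear process chain, as stated in the text

# Compounding as in footnote 8: every process multiplies the flow by (100% + 2%).
compounded = (1 + RELATIVE_UNCERTAINTY) ** N_PROCESSES - 1

# For comparison only (assumption of this sketch, not the paper's calculation):
# first-order root-sum-of-squares combination of independent relative errors.
root_sum_of_squares = math.sqrt(N_PROCESSES) * RELATIVE_UNCERTAINTY

print(f"compounded over {N_PROCESSES} processes: {compounded:.1%}")
print(f"root-sum-of-squares, for comparison: {root_sum_of_squares:.1%}")
```

Which convention is appropriate depends on how the uncertainties in the individual flows are assumed to interact; the compounded value is the more pessimistic of the two.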

In any case study, the impact of uncertainty 9 will depend on 
the structure of the product system; in comparative assertions, 
it will also depend on the amount and location of uncertainties 
in the compared product system, and on its structure. Quite 
often, similar processes in both compared systems are omitted; 
this might reduce the sources of uncertainty, but it might also 
lead to a higher relative uncertainty in the result. These 
topics are not completely solved but have been discussed in 
earlier literature, e.g. [Ciroth 2001], [Ciroth et al. 2004], 
[Huijbregts 2001]. 

3.7.5 Similar approaches in marketing 

In marketing, one often encounters the need to control the 
effect of a marketing campaign, or, more basically, to learn 
more about consumer behaviour, or about competitors. Quite 
often, yellow pages, web directories, and similar information 
sources are consulted in these cases ('the Yellow Pages as a 
market research tool', [Gross et al. 1993, p. 147]); 'Marketers 
should regularly and systematically utilize numerous sources 
for competitive evaluation', states Weinrauch [Weinrauch 
1987, pp. 18], listing about 25 different sources. While the 
goal and scope of marketing are of course not fully in line with 
those of a Life Cycle Assessment, the need to analyse carefully 
the circumstances in which the product is used, and what 
exactly the product looks like at the point of sale, is 
very comparable between both disciplines. This is obviously 
only an analogy at first glance, which needs to be explored 
in more detail regarding the applied tools (for marketing, 
e.g. [Fitzroy 1976]) and their performance. 

4 Conclusions and Perspectives 

Yoghurt cups and other plastic food packaging have often 
been the subject of Life Cycle Assessments or similar analyses, 
e.g. [IFEU 2003], [Petcore 2004], [IFEU 2006], 
[Keoleian et al. 2001]. The functional unit, in each case, 
will often depend on the weight of the cup in a linear manner. 
In consequence, a difference in the weight per packed 
volume of, e.g., five percent between two alternatives will 
lead, ceteris paribus, to a difference in impact assessment 
results of five percent as well (assuming again a linear LCA 
model). Thus there is a clear need for reliable and precise 
information for the parameter 'weight', especially in 
comparative assertions. 

The sampling of yoghurt cups, in the case study, provided 
precise and representative estimates, based on empirical 
investigation. The effort was manageable. Three different 
multi-stage sampling designs were tested; all of them provided 
precise estimates of the mean weight of a yoghurt cup. The 
cluster sampling had practical advantages in the case study. 


8 (100% + 2%)^7 ≈ 115%; for the simplest case, assuming that 
uncertainties are independent. 

9 'Impact of uncertainty' meaning here: how the quantitative results, 
the ranking, and the conclusions drawn from the study will be affected 
by uncertainty. 


The results clearly demonstrate that high market share does 
not at all ensure representativeness. Instead, products (yoghurt 
cups) with high market shares often were very different 
from a representative average. The results indicate, in 
addition, low uncertainties (standard deviations) for most 
cups, and moderate uncertainties for single types of yoghurts. 
Using more of the available information in the sampling 
procedure and in the calculation of estimates for the mean 
reduced the uncertainty in the estimates considerably, 
yielding highly precise estimates. 

This demonstrates that more information reduces uncertainty, 
or, in other words, observed uncertainty is the result 
of changes in data that cannot be explained or 'understood' 
otherwise. A change in the shape of a yoghurt cup often 
influences its weight; if the shape is known, the uncertainty 
in the estimated weight will decrease. If the brand name is 
known in addition, this will, with a specific sampling plan, 
help to further reduce the uncertainty 10. 
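
The statement that known properties reduce the remaining uncertainty can be illustrated with a variance decomposition. The sketch below is only an illustration under stated assumptions: the weights and the three shape groups are hypothetical, and population variances are used so that total variance equals the variance within shapes plus the variance between shapes.

```python
from statistics import mean, pvariance

# Hypothetical cup weights in grams, grouped by cup shape; not case-study data.
weights_by_shape = {
    "narrow": [4.8, 5.0, 5.2],
    "wide": [6.0, 6.2, 6.4],
    "with_foot": [7.4, 7.6, 7.8],
}

all_weights = [w for group in weights_by_shape.values() for w in group]
n_total = len(all_weights)
overall_mean = mean(all_weights)

# Variance of the weight when nothing is known about the shape.
total_variance = pvariance(all_weights)

# Variance that remains once the shape is known:
# size-weighted average of the variances within each shape.
within_variance = sum(
    len(group) * pvariance(group) for group in weights_by_shape.values()
) / n_total

# Variance explained by the shape: spread of the shape means around the overall mean.
between_variance = sum(
    len(group) * (mean(group) - overall_mean) ** 2
    for group in weights_by_shape.values()
) / n_total

print(f"total:   {total_variance:.4f}")
print(f"within:  {within_variance:.4f}")
print(f"between: {between_variance:.4f}")
```

In this hypothetical example almost all of the variation is explained by the shape, so knowing the shape leaves only a small remaining uncertainty in the weight.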

Further, this indicates that 'uncertainty' is not a specific, fixed 
characteristic of a yoghurt cup, as would be the type of 
plastic; rather, the uncertainty of the weight depends on two 
main parameters: 

First: What is the precise object of study, what is the 
'population' in terms of statistical sampling? '150 g yoghurt cups' 
would be too unspecific for the case. The specific type of 
yoghurt cup, time, and geographical scope are relevant 
additional aspects; 

Second: What does the measurement procedure look like? A 
smart sampling procedure can reduce uncertainty. 

It seems fair to assume that these two parameters hold for 
all quantitative data in LCA studies. Somewhat surprisingly, 
both are fully covered by the definition of a functional unit 
according to ISO 14040 11. However, the measurement 
procedure is rarely addressed in LCA literature (one example is 
the Cascade project [Cascade 2003]). The measurement 
procedure for uncertainty is addressed even less often. One 
of the few examples is [Sugiyama et al. 2005], who discuss 
how to obtain probability distributions from survey data. 

We conclude that the measurement procedure for uncertainty 
shall be mentioned if uncertainty information is provided 
with quantitative data in LCA. Further, we conclude that 
analysing market leaders alone is not justified when looking 
for representative data. 


10 This interpretation is in line with classical information theory by 
Shannon: Shannon defines with H the information entropy, which is "a 
measure for [...] how uncertain we are of the outcome [of selecting several 
events]" [Shannon 1948, p. 10]. For two events x, y holds: the information 
entropy of y, given knowledge of x, is equal to or lower than the entropy 
of y alone. Shannon: "The uncertainty of y is never increased by knowledge 
of x. It will be decreased unless x and y are independent events, in which 
case it is not changed". If we define measuring weight and other properties 
of yoghurt cups as events, then, according to Shannon, the uncertainty of 
the weight will decrease by knowledge of another property of the yoghurt 
cup unless the property does not influence the weight. 

11 While this statement is evident for the specification of the object of study, 
it is also true for the specification of the measurement procedure. ISO 
14040, section 5.2.2, Function, functional unit and reference flows: "The 
functional unit defines the quantification of the identified functions [...] 
of the product". Quantification implies the measurement procedure and 
the quantified measurement value. 

Finally, it seems fair to state that more research is needed on 
appropriate sampling procedures, data sources, and parameters 
of LCA product systems which should undergo statistical 
sampling. Emerging new data sources, with acceptable 
accuracy, will foster the application of statistical sampling 
methods which are, when applicable, clearly superior to 
expert judgement and to analysing market leaders alone. 

Annex 

3.5 Calculation procedure for a representative estimate 
3.5.1 Stage 1: Food markets in Berlin 

For the stratified sampling in the first stage, an estimator for 
the average number of different types of yoghurt cups in one 
market is calculated as follows [Cochran 1977, pp. 91]: 

\[ \bar{Y} = \frac{1}{N} \sum_{h=1}^{M} N_h \, \bar{y}_h \]

with the variance of this estimator: 

\[ V(\bar{Y}) = \frac{1}{N^2} \sum_{h=1}^{M} N_h^2 \, \frac{s_h^2}{n_h} \quad \text{with} \quad s_h^2 = \frac{1}{n_h - 1} \sum_{k=1}^{n_h} \left( y_{hk} - \bar{y}_h \right)^2 \]

with: 

N = number of food markets in Berlin (population 1) 
N_h = number of food markets of stratum h in Berlin 
M = number of strata 
n_h = number of food markets of stratum h in the sample 
\bar{y}_h = average number of types of yoghurt cups in a food market in stratum h 
y_hk = number of types of yoghurt cups in food market k in stratum h 
s_h^2 = variance of the number of types of yoghurt cups in stratum h 

In the first stage of the sampling, the estimate Y is the 
average number of different types of yoghurt cups in one market. 
This average is calculated as 7.823 types of yoghurt cups per 
market, and the total as 9,504 yoghurt cups of different types 
in Berlin for the sampling day, which is also the number of 
elements in the population for stage two. 
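
The stage 1 calculation can be sketched as follows. This is a minimal illustration assuming the textbook form of the stratified estimator (stratum means weighted by stratum size, variance without finite population correction); the stratum names, stratum sizes, and observed counts are hypothetical and do not reproduce the Berlin data.

```python
from statistics import mean, variance

# Hypothetical number of food markets per stratum in the population (N_h).
markets_in_population = {"discounter": 400, "supermarket": 600, "organic": 200}

# Hypothetical number of yoghurt cup types observed in each sampled market (y_hk).
cup_types_in_sampled_markets = {
    "discounter": [4, 5, 6],
    "supermarket": [8, 9, 10, 9],
    "organic": [11, 13],
}


def stratified_mean_and_variance(population_sizes, samples):
    """Stratified estimate of the mean and the variance of that estimate.

    Assumes simple random sampling within each stratum and ignores the
    finite population correction.
    """
    n_population = sum(population_sizes.values())
    estimate = sum(
        population_sizes[h] * mean(samples[h]) for h in population_sizes
    ) / n_population
    estimate_variance = sum(
        population_sizes[h] ** 2 * variance(samples[h]) / len(samples[h])
        for h in population_sizes
    ) / n_population ** 2
    return estimate, estimate_variance


types_per_market, variance_of_estimate = stratified_mean_and_variance(
    markets_in_population, cup_types_in_sampled_markets
)
# Estimated total number of cup types on offer: markets times average per market.
total_cup_types = sum(markets_in_population.values()) * types_per_market

print(f"average types per market: {types_per_market:.3f}")
print(f"variance of the estimate: {variance_of_estimate:.4f}")
print(f"estimated total: {total_cup_types:.0f}")
```

The estimated total plays the same role as the 9,504 cup types in the case study: it defines the size of the population for the second sampling stage.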


3.5.2 Stage 2: Cluster sampling 


For the cluster sampling in stage 2, the weight of the yoghurt 
cups is estimated via [Cochran 1977, pp. 250], [Kauermann 
2006, pp. 89]: 

\[ \bar{Y} = \frac{\sum_{l=1}^{m} y_l}{\sum_{l=1}^{m} N_l} \]

with the sum for the cluster: 

\[ y_l = \sum_{i=1}^{N_l} y_{li} \]

The variance of this mean is estimated from: 

\[ Var(\bar{Y}) = \frac{1}{\bar{N}^2} \, \frac{M - m}{M} \, \frac{1}{m(m-1)} \sum_{l=1}^{m} \left( y_l - N_l \bar{Y} \right)^2 \]

with: 

M = number of clusters in the population 
m = number of clusters in the sample 
\bar{N} = average number of yoghurt cup types in a market 
N_l = number of yoghurt cup types in market l 
y_li = weight of yoghurt cup type i in market l 


3.5.3 Stage 2: Stratified sampling 

For the stratified sampling in stage 2, one needs to know the 
share of each stratum in the population. The shares are 
obtained via: 

\[ P_k = \sum_{h} \frac{N_h \, p_{hk}}{N} \quad \text{with} \quad p_{hk} = \frac{a_{hk}}{n_h} \]

with: 

N = number of food markets in Berlin 
N_h = number of food markets in (market) stratum h in Berlin 
n_h = number of types of yoghurt cups in the sample in the market stratum h 
a_hk = number of types of yoghurt cups in market stratum h and yoghurt cup stratum k 
p_hk = share of yoghurt cup stratum k in market stratum h 
P_k = estimator for the share of yoghurt cup stratum k in the population 

12 market strata and 16 yoghurt cup strata yield 192 possible 
shares of yoghurt cup strata in food market strata, p_hk. Some of 
the shares are zero since not all markets offer all types of 
yoghurt. For all elements in one yoghurt cup stratum k, 
N_k = P_k · N holds, and the number of elements is calculated 
in this way for all strata. 

A proportional sampling was applied (see above). The calculation 
of the estimated mean and variance for the weight of the 
yoghurt cups in the population is similar to calculating the 
estimators in the first sampling stage, for the market sampling: 

\[ \bar{Y} = \frac{1}{N} \sum_{k} N_k \, \bar{y}_k , \qquad V(\bar{Y}) = \frac{1}{N^2} \sum_{k} N_k^2 \, \frac{s_k^2}{n_k} \]

with: 

N = number of types of yoghurt cups in the population 
N_k = number of types of yoghurt cups of stratum k in the population 
n_k = number of types of yoghurt cups of stratum k in the sample 
\bar{y}_k = average weight of the types of yoghurt cups in stratum k, in the sample 
s_k^2 = variance of the weight of the yoghurt cups in stratum k, in the sample 
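
The share calculation for the stratified sampling in stage 2 can be sketched as below. The sketch follows the relations given in the text (p_hk = a_hk / n_h, shares weighted by the number of markets per market stratum, and N_k = P_k · N); all stratum names and counts are hypothetical, and only the population size of 9,504 cup types is taken from stage 1.

```python
# Hypothetical inputs; stratum names and counts are for illustration only.
markets_per_stratum = {"A": 300, "B": 100}   # N_h: food markets per market stratum
sampled_types = {"A": 20, "B": 10}           # n_h: cup types sampled per market stratum
types_by_cup_stratum = {                     # a_hk: sampled cup types per cup stratum k
    "A": {"PP_narrow": 12, "PS_narrow": 8},
    "B": {"PP_narrow": 3, "PS_narrow": 5, "glass": 2},
}
POPULATION_CUP_TYPES = 9504                  # number of cup types from stage 1 (from the text)


def cup_stratum_shares(n_markets_h, n_sampled_h, a_hk):
    """Estimate the population share P_k of every yoghurt cup stratum k."""
    n_markets = sum(n_markets_h.values())
    shares = {}
    for h, counts in a_hk.items():
        for k, count in counts.items():
            p_hk = count / n_sampled_h[h]  # share of cup stratum k in market stratum h
            shares[k] = shares.get(k, 0.0) + n_markets_h[h] * p_hk / n_markets
    return shares


shares = cup_stratum_shares(markets_per_stratum, sampled_types, types_by_cup_stratum)
# Number of elements per cup stratum, N_k = P_k * N.
elements_per_stratum = {k: p * POPULATION_CUP_TYPES for k, p in shares.items()}

for k in shares:
    print(f"{k}: share {shares[k]:.3f}, about {elements_per_stratum[k]:.0f} cup types")
```

Cup strata that do not occur in a market stratum simply contribute a share of zero there, which corresponds to the zero entries among the 192 possible shares mentioned in the text.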


3.5.4 Stage 2: A posteriori stratification of the cluster sample 

The a posteriori stratification was performed for the elements 
in the cluster sample by using secondary criteria for the 
definition of the strata in the same way as they were used in the 
stratified sample, in stage 1. Each element in the cluster sample 
was assigned to a stratum, and the share of each stratum in the 
population was taken from the stratified sampling. Estimators 
for mean and variance are calculated as in the stratified 
sampling procedure. 


3.5.5 Calculation of uncertainty 

The variances of the estimated means above are a measure for 
the precision of the respective estimators. They do not reflect 
the uncertainty of the population, which is the variation around 
the estimated mean. This variation, in turn, can be expressed as 

\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( y_i - \bar{Y} \right)^2 \]
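
The difference between the variation in the population and the precision of the estimated mean can be made concrete with a small example. The sketch below uses hypothetical weights and, as a simplifying assumption, the standard error of a simple random sample; the stratified and cluster designs of the case study use the estimators of the preceding sections instead.

```python
import math
from statistics import mean, stdev

# Hypothetical cup weights in grams; not data from the case study.
weights = [4.9, 5.3, 5.8, 6.1, 6.4, 5.0, 7.2, 5.6, 6.0, 5.7]

sample_mean = mean(weights)
# Spread of the individual cups around the mean (n - 1 in the denominator).
population_spread = stdev(weights)
coefficient_of_variation = population_spread / sample_mean

# Precision of the mean itself, assuming simple random sampling.
standard_error = population_spread / math.sqrt(len(weights))
relative_standard_error = standard_error / sample_mean

print(f"coefficient of variation of the cups: {coefficient_of_variation:.1%}")
print(f"relative standard error of the mean:  {relative_standard_error:.1%}")
```

The first figure describes how much single cups differ from each other and does not shrink with a larger sample; the second describes how well the mean is known and decreases as the sample grows.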
Fig. 4: (see section 3.4 'Uncertainty in samples data'). Coefficient of variation (standard deviation / mean), per type of yoghurt cup. Cups that occurred 
only once in the sample are excluded. 
[Bar chart. Horizontal axis: coefficient of variation, 0 to 0.09. Vertical axis, types of yoghurt cup: Müller Froop; Yoganic 0.1% schmal; Mövenpick Feinjoghurt; Elite Joghurt mild (Fuß); Landliebe Joghurt; Elite Vit Balance; Sahnejoghurt; Elite Sahnejoghurt mild (Fuß); Sachsenmilch Vanillezauber; Yoganic 0.1% breit; Milram Magermilch Joghurt; Elite Karamell/Vanilla; Ehrmann Almighurt; Bauer Joghurt mild; Biac probiotischer Joghurt 4*150; Ehrmann Genuss Diät; Zott Sahnejoghurt; Dr. Oetker Jobst; Minus L Joghurt; Grazil Joghurt; Hofmaier Cream-Jogh.; Gut & fein Joghurt mild; Elite Joghurt mild; Tipp Joghurt mild; Gut & Günstig Sahnejoghurt; Zott Jogole; Fruchtjoghurt mild 8*150; Alpa Frucht Yoghurt; Sachsenmilch Früchtchen; Mark Brandenburg Joghurt mild; Elinas Joghurt nach griechischer Art 4*150; Fruchtjoghurt mild 4*150; Tipp Sahne Joghurt mild; Campina Milchreiter; Elite Vanilla auf Frucht (klar)] 


Fig. 6: (see section 3.7.2 'Sampling design'). Validation of the samples: frequency distribution of the weight according to cluster sampling and stratified sampling. s: empirical 
standard deviation from the cluster sample; x: empirical mean from the cluster sample. 
[Histogram. Vertical axis: 0 to 60. Horizontal axis: weight classes defined relative to x and s. Series: results of the cluster sample; results of the stratified sample] 